Serveur d'exploration sur Pittsburgh

Attention, ce site est en cours de développement !
Attention, site généré par des moyens informatiques à partir de corpus bruts.
Les informations ne sont donc pas validées.

Model selection for degree-corrected block models

Identifieur interne : 000239 ( France/Analysis ); précédent : 000238; suivant : 000240

Model selection for degree-corrected block models

Auteurs : Xiaoran Yan [États-Unis] ; Cosma Shalizi [États-Unis] ; Jacob E. Jensen [États-Unis] ; Florent Krzakala [France] ; Cristopher Moore [États-Unis] ; Lenka Zdeborová [France] ; Pan Zhang [États-Unis] ; Yaojia Zhu [États-Unis]

Source :

RBID : PMC:4498413

Abstract

The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.


Url:
DOI: 10.1088/1742-5468/2014/05/P05007
PubMed: 26167197
PubMed Central: 4498413


Affiliations:


Links toward previous steps (curation, corpus...)


Links to Exploration step

PMC:4498413

Le document en format XML

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Model selection for degree-corrected block models</title>
<author>
<name sortKey="Yan, Xiaoran" sort="Yan, Xiaoran" uniqKey="Yan X" first="Xiaoran" last="Yan">Xiaoran Yan</name>
<affiliation wicri:level="4">
<nlm:aff id="A1">Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
<settlement type="city">Los Angeles</settlement>
</placeName>
<orgName type="university">Université de Californie du Sud</orgName>
</affiliation>
</author>
<author>
<name sortKey="Shalizi, Cosma" sort="Shalizi, Cosma" uniqKey="Shalizi C" first="Cosma" last="Shalizi">Cosma Shalizi</name>
<affiliation wicri:level="4">
<nlm:aff id="A2">Statistics Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistics Department, Carnegie Mellon University, Pittsburgh, PA 15213</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Jensen, Jacob E" sort="Jensen, Jacob E" uniqKey="Jensen J" first="Jacob E" last="Jensen">Jacob E. Jensen</name>
<affiliation wicri:level="2">
<nlm:aff id="A3">Computer Science Department, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computer Science Department, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Krzakala, Florent" sort="Krzakala, Florent" uniqKey="Krzakala F" first="Florent" last="Krzakala">Florent Krzakala</name>
<affiliation wicri:level="3">
<nlm:aff id="A4">Ecole Normale Supérieure, F-75005 Paris, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Ecole Normale Supérieure, F-75005 Paris</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Moore, Cristopher" sort="Moore, Cristopher" uniqKey="Moore C" first="Cristopher" last="Moore">Cristopher Moore</name>
<affiliation wicri:level="2">
<nlm:aff id="A5">Santa Fe Institute, Santa Fe, NM 87501, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Santa Fe Institute, Santa Fe, NM 87501</wicri:regionArea>
<placeName>
<region type="state">Nouveau-Mexique</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zdeborova, Lenka" sort="Zdeborova, Lenka" uniqKey="Zdeborova L" first="Lenka" last="Zdeborová">Lenka Zdeborová</name>
<affiliation wicri:level="3">
<nlm:aff id="A6">Institut de Physique Théorique, CEA Saclay and URA 2306, CNRS, F-91190 Gif-sur-Yvette, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Institut de Physique Théorique, CEA Saclay and URA 2306, CNRS, F-91190 Gif-sur-Yvette</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Gif-sur-Yvette</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Pan" sort="Zhang, Pan" uniqKey="Zhang P" first="Pan" last="Zhang">Pan Zhang</name>
<affiliation wicri:level="2">
<nlm:aff id="A5">Santa Fe Institute, Santa Fe, NM 87501, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Santa Fe Institute, Santa Fe, NM 87501</wicri:regionArea>
<placeName>
<region type="state">Nouveau-Mexique</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhu, Yaojia" sort="Zhu, Yaojia" uniqKey="Zhu Y" first="Yaojia" last="Zhu">Yaojia Zhu</name>
<affiliation wicri:level="2">
<nlm:aff id="A7">Microsoft, Bellevue, WA 98004, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Microsoft, Bellevue, WA 98004</wicri:regionArea>
<placeName>
<region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">26167197</idno>
<idno type="pmc">4498413</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4498413</idno>
<idno type="RBID">PMC:4498413</idno>
<idno type="doi">10.1088/1742-5468/2014/05/P05007</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">001093</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Corpus" wicri:corpus="PMC">001093</idno>
<idno type="wicri:Area/Pmc/Curation">001068</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Curation">001068</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000C71</idno>
<idno type="wicri:explorRef" wicri:stream="Pmc" wicri:step="Checkpoint">000C71</idno>
<idno type="wicri:source">PubMed</idno>
<idno type="wicri:Area/PubMed/Corpus">002D48</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">002D48</idno>
<idno type="wicri:Area/PubMed/Curation">002D29</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">002D29</idno>
<idno type="wicri:Area/PubMed/Checkpoint">002D29</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">002D29</idno>
<idno type="wicri:Area/Ncbi/Merge">004917</idno>
<idno type="wicri:Area/Ncbi/Curation">004917</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">004917</idno>
<idno type="wicri:Area/Main/Merge">002184</idno>
<idno type="wicri:Area/Main/Curation">002096</idno>
<idno type="wicri:Area/Main/Exploration">002096</idno>
<idno type="wicri:Area/France/Extraction">000239</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Model selection for degree-corrected block models</title>
<author>
<name sortKey="Yan, Xiaoran" sort="Yan, Xiaoran" uniqKey="Yan X" first="Xiaoran" last="Yan">Xiaoran Yan</name>
<affiliation wicri:level="4">
<nlm:aff id="A1">Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Information Sciences Institute, University of Southern California, Marina del Rey, CA 90292</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
<settlement type="city">Los Angeles</settlement>
</placeName>
<orgName type="university">Université de Californie du Sud</orgName>
</affiliation>
</author>
<author>
<name sortKey="Shalizi, Cosma" sort="Shalizi, Cosma" uniqKey="Shalizi C" first="Cosma" last="Shalizi">Cosma Shalizi</name>
<affiliation wicri:level="4">
<nlm:aff id="A2">Statistics Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Statistics Department, Carnegie Mellon University, Pittsburgh, PA 15213</wicri:regionArea>
<placeName>
<region type="state">Pennsylvanie</region>
<settlement type="city">Pittsburgh</settlement>
</placeName>
<orgName type="university">Université Carnegie-Mellon</orgName>
</affiliation>
</author>
<author>
<name sortKey="Jensen, Jacob E" sort="Jensen, Jacob E" uniqKey="Jensen J" first="Jacob E" last="Jensen">Jacob E. Jensen</name>
<affiliation wicri:level="2">
<nlm:aff id="A3">Computer Science Department, Stanford University, Stanford, CA 94305, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Computer Science Department, Stanford University, Stanford, CA 94305</wicri:regionArea>
<placeName>
<region type="state">Californie</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Krzakala, Florent" sort="Krzakala, Florent" uniqKey="Krzakala F" first="Florent" last="Krzakala">Florent Krzakala</name>
<affiliation wicri:level="3">
<nlm:aff id="A4">Ecole Normale Supérieure, F-75005 Paris, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Ecole Normale Supérieure, F-75005 Paris</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Paris</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Moore, Cristopher" sort="Moore, Cristopher" uniqKey="Moore C" first="Cristopher" last="Moore">Cristopher Moore</name>
<affiliation wicri:level="2">
<nlm:aff id="A5">Santa Fe Institute, Santa Fe, NM 87501, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Santa Fe Institute, Santa Fe, NM 87501</wicri:regionArea>
<placeName>
<region type="state">Nouveau-Mexique</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zdeborova, Lenka" sort="Zdeborova, Lenka" uniqKey="Zdeborova L" first="Lenka" last="Zdeborová">Lenka Zdeborová</name>
<affiliation wicri:level="3">
<nlm:aff id="A6">Institut de Physique Théorique, CEA Saclay and URA 2306, CNRS, F-91190 Gif-sur-Yvette, France</nlm:aff>
<country xml:lang="fr">France</country>
<wicri:regionArea>Institut de Physique Théorique, CEA Saclay and URA 2306, CNRS, F-91190 Gif-sur-Yvette</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Île-de-France</region>
<settlement type="city">Gif-sur-Yvette</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhang, Pan" sort="Zhang, Pan" uniqKey="Zhang P" first="Pan" last="Zhang">Pan Zhang</name>
<affiliation wicri:level="2">
<nlm:aff id="A5">Santa Fe Institute, Santa Fe, NM 87501, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Santa Fe Institute, Santa Fe, NM 87501</wicri:regionArea>
<placeName>
<region type="state">Nouveau-Mexique</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Zhu, Yaojia" sort="Zhu, Yaojia" uniqKey="Zhu Y" first="Yaojia" last="Zhu">Yaojia Zhu</name>
<affiliation wicri:level="2">
<nlm:aff id="A7">Microsoft, Bellevue, WA 98004, USA</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Microsoft, Bellevue, WA 98004</wicri:regionArea>
<placeName>
<region type="state">Washington (État)</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Journal of statistical mechanics (Online)</title>
<idno type="eISSN">1742-5468</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p id="P1">The proliferation of models for networks raises challenging problems of model selection: the data are sparse and globally dependent, and models are typically high-dimensional and have large numbers of latent variables. Together, these issues mean that the usual model-selection criteria do not work properly for networks. We illustrate these challenges, and show one way to resolve them, by considering the key network-analysis problem of dividing a graph into communities or blocks of nodes with homogeneous patterns of links to the rest of the network. The standard tool for undertaking this is the stochastic block model, under which the probability of a link between two nodes is a function solely of the blocks to which they belong. This imposes a homogeneous degree distribution within each block; this can be unrealistic, so degree-corrected block models add a parameter for each node, modulating its overall degree. The choice between ordinary and degree-corrected block models matters because they make very different inferences about communities. We present the first principled and tractable approach to model selection between standard and degree-corrected block models, based on new large-graph asymptotics for the distribution of log-likelihood ratios under the stochastic block model, finding substantial departures from classical results for sparse graphs. We also develop linear-time approximations for log-likelihoods under both the stochastic block model and the degree-corrected model, using belief propagation. Applications to simulated and real networks show excellent agreement with our approximations. Our results thus both solve the practical problem of deciding on degree correction and point to a general approach to model selection in network analysis.</p>
</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
<li>États-Unis</li>
</country>
<region>
<li>Californie</li>
<li>Nouveau-Mexique</li>
<li>Pennsylvanie</li>
<li>Washington (État)</li>
<li>Île-de-France</li>
</region>
<settlement>
<li>Gif-sur-Yvette</li>
<li>Los Angeles</li>
<li>Paris</li>
<li>Pittsburgh</li>
</settlement>
<orgName>
<li>Université Carnegie-Mellon</li>
<li>Université de Californie du Sud</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Californie">
<name sortKey="Yan, Xiaoran" sort="Yan, Xiaoran" uniqKey="Yan X" first="Xiaoran" last="Yan">Xiaoran Yan</name>
</region>
<name sortKey="Jensen, Jacob E" sort="Jensen, Jacob E" uniqKey="Jensen J" first="Jacob E" last="Jensen">Jacob E. Jensen</name>
<name sortKey="Moore, Cristopher" sort="Moore, Cristopher" uniqKey="Moore C" first="Cristopher" last="Moore">Cristopher Moore</name>
<name sortKey="Shalizi, Cosma" sort="Shalizi, Cosma" uniqKey="Shalizi C" first="Cosma" last="Shalizi">Cosma Shalizi</name>
<name sortKey="Zhang, Pan" sort="Zhang, Pan" uniqKey="Zhang P" first="Pan" last="Zhang">Pan Zhang</name>
<name sortKey="Zhu, Yaojia" sort="Zhu, Yaojia" uniqKey="Zhu Y" first="Yaojia" last="Zhu">Yaojia Zhu</name>
</country>
<country name="France">
<region name="Île-de-France">
<name sortKey="Krzakala, Florent" sort="Krzakala, Florent" uniqKey="Krzakala F" first="Florent" last="Krzakala">Florent Krzakala</name>
</region>
<name sortKey="Zdeborova, Lenka" sort="Zdeborova, Lenka" uniqKey="Zdeborova L" first="Lenka" last="Zdeborová">Lenka Zdeborová</name>
</country>
</tree>
</affiliations>
</record>

Pour manipuler ce document sous Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Amérique/explor/PittsburghV1/Data/France/Analysis
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000239 | SxmlIndent | more

Ou

HfdSelect -h $EXPLOR_AREA/Data/France/Analysis/biblio.hfd -nk 000239 | SxmlIndent | more

Pour mettre un lien sur cette page dans le réseau Wicri

{{Explor lien
   |wiki=    Wicri/Amérique
   |area=    PittsburghV1
   |flux=    France
   |étape=   Analysis
   |type=    RBID
   |clé=     PMC:4498413
   |texte=   Model selection for degree-corrected block models
}}

Pour générer des pages wiki

HfdIndexSelect -h $EXPLOR_AREA/Data/France/Analysis/RBID.i   -Sk "pubmed:26167197" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/France/Analysis/biblio.hfd   \
       | NlmPubMed2Wicri -a PittsburghV1 

Wicri

This area was generated with Dilib version V0.6.38.
Data generation: Fri Jun 18 17:37:45 2021. Site generation: Fri Jun 18 18:15:47 2021